<!-- build time:Wed Jun 21 2023 22:33:35 GMT+0800 (GMT+08:00) --><!DOCTYPE html><html lang="zh-CN"><head><meta charset="UTF-8"><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=2"><meta name="theme-color" content="#FFF"><meta name="baidu-site-verification" content="code-C0oocRvMWv"><link rel="apple-touch-icon" sizes="180x180" href="/images/apple-touch-icon.png"><link rel="icon" type="image/ico" sizes="32x32" href="/images/favicon.ico"><link rel="mask-icon" href="/images/logo.svg" color=""><link rel="manifest" href="/images/manifest.json"><meta name="msapplication-config" content="/images/browserconfig.xml"><meta http-equiv="Cache-Control" content="no-transform"><meta http-equiv="Cache-Control" content="no-siteapp"><meta name="baidu-site-verification" content="https://jiang-hs.gitee.io"><link rel="alternate" type="application/rss+xml" title="航 順" href="https://jiang-hs.gitee.io/rss.xml"><link rel="alternate" type="application/atom+xml" title="航 順" href="https://jiang-hs.gitee.io/atom.xml"><link rel="alternate" type="application/json" title="航 順" href="https://jiang-hs.gitee.io/feed.json"><link rel="stylesheet" href="//fonts.googleapis.com/css?family=Mulish:300,300italic,400,400italic,700,700italic%7CFredericka%20the%20Great:300,300italic,400,400italic,700,700italic%7CNoto%20Serif%20JP:300,300italic,400,400italic,700,700italic%7CNoto%20Serif%20SC:300,300italic,400,400italic,700,700italic%7CInconsolata:300,300italic,400,400italic,700,700italic&display=swap&subset=latin,latin-ext"><link rel="stylesheet" href="/css/app.css?v=0.0.0"><meta name="keywords" content="人工智能,机器学习基础"><link rel="canonical" href="https://jiang-hs.gitee.io/posts/202f1f0f/"><meta name="description" content="# 一、梯度下降 # 1. 梯度下降简介 梯度下降法（gradient descent）是一个最优化算法，也称为最速下降法（steepest descent）。常用于机器学习和人工智能当中用来递归性地逼近最小偏差模型。寻找极小点时，从负梯度的方向寻找。 # 2. 核心公式 θi&#x3D;θi−α∂∂θiJ(θ)θ_i&#x3D;θ_i−α\frac{∂}{∂θ_i}J(θ) θi​&#x3D;θi​−α∂θi​∂​J(θ)"><meta property="og:type" content="article"><meta property="og:title" content="梯度下降及线性回归"><meta property="og:url" content="https://jiang-hs.gitee.io/posts/202f1f0f/index.html"><meta property="og:site_name" content="航 順"><meta property="og:description" content="# 一、梯度下降 # 1. 梯度下降简介 梯度下降法（gradient descent）是一个最优化算法，也称为最速下降法（steepest descent）。常用于机器学习和人工智能当中用来递归性地逼近最小偏差模型。寻找极小点时，从负梯度的方向寻找。 # 2. 核心公式 θi&#x3D;θi−α∂∂θiJ(θ)θ_i&#x3D;θ_i−α\frac{∂}{∂θ_i}J(θ) θi​&#x3D;θi​−α∂θi​∂​J(θ)"><meta property="og:locale" content="zh_CN"><meta property="og:image" content="https://jiang-hs.github.io/post-images/1590134245318.png"><meta property="og:image" content="https://jiang-hs.github.io/post-images/1590134254515.png"><meta property="og:image" content="https://jiang-hs.github.io/post-images/1590134259162.png"><meta property="og:image" content="https://jiang-hs.github.io/post-images/1590134267713.png#pic_l"><meta property="og:image" content="https://jiang-hs.github.io/post-images/1590134271990.png#pic_r"><meta property="article:published_time" content="2021-03-06T07:21:13.000Z"><meta property="article:modified_time" content="2021-08-25T03:32:03.796Z"><meta property="article:author" content="hang shun"><meta property="article:tag" content="人工智能"><meta property="article:tag" content="机器学习基础"><meta name="twitter:card" content="summary"><meta name="twitter:image" content="https://jiang-hs.github.io/post-images/1590134245318.png"><title>梯度下降及线性回归 - 机器学习基础 | hang shun = 航 順 = 天官赐福，百无禁忌</title><meta name="generator" content="Hexo 5.4.2"></head><body itemscope itemtype="http://schema.org/WebPage"><div id="loading"><div class="cat"><div class="body"></div><div class="head"><div class="face"></div></div><div class="foot"><div class="tummy-end"></div><div class="bottom"></div><div class="legs left"></div><div class="legs right"></div></div><div class="paw"><div class="hands left"></div><div class="hands right"></div></div></div></div><div id="container"><header id="header" itemscope itemtype="http://schema.org/WPHeader"><div class="inner"><div id="brand"><div class="pjax"><h1 itemprop="name headline">梯度下降及线性回归</h1><div class="meta"><span class="item" title="创建时间：2021-03-06 15:21:13"><span class="icon"><i class="ic i-calendar"></i> </span><span class="text">发表于</span> <time itemprop="dateCreated datePublished" datetime="2021-03-06T15:21:13+08:00">2021-03-06</time> </span><span class="item" title="本文字数"><span class="icon"><i class="ic i-pen"></i> </span><span class="text">本文字数</span> <span>7.4k</span> <span class="text">字</span> </span><span class="item" title="阅读时长"><span class="icon"><i class="ic i-clock"></i> </span><span class="text">阅读时长</span> <span>7 分钟</span></span></div></div></div><nav id="nav"><div class="inner"><div class="toggle"><div class="lines" aria-label="切换导航栏"><span class="line"></span> <span class="line"></span> <span class="line"></span></div></div><ul class="menu"><li class="item title"><a href="/" rel="start">hang shun</a></li></ul><ul class="right"><li class="item theme"><i class="ic i-sun"></i></li><li class="item search"><i class="ic i-search"></i></li></ul></div></nav></div><div id="imgs" class="pjax"><ul><li class="item" data-background-image="https://pic1.imgdb.cn/item/60d7f9475132923bf8a8a29b.jpg"></li><li class="item" data-background-image="https://pic1.imgdb.cn/item/64427c390d2dde5777afabd1.jpg"></li><li class="item" data-background-image="https://pic1.imgdb.cn/item/60d7f9855132923bf8a9f1ba.jpg"></li><li class="item" data-background-image="https://pic1.imgdb.cn/item/60d7f9cb5132923bf8ab5c4b.jpg"></li><li class="item" data-background-image="https://pic1.imgdb.cn/item/64427ca10d2dde5777b04c1f.png"></li><li class="item" data-background-image="https://pic1.imgdb.cn/item/60d7f98f5132923bf8aa2419.jpg"></li></ul></div></header><div id="waves"><svg class="waves" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 24 150 28" preserveAspectRatio="none" shape-rendering="auto"><defs><path id="gentle-wave" d="M-160 44c30 0 58-18 88-18s 58 18 88 18 58-18 88-18 58 18 88 18 v44h-352z"/></defs><g class="parallax"><use xlink:href="#gentle-wave" x="48" y="0"/><use xlink:href="#gentle-wave" x="48" y="3"/><use xlink:href="#gentle-wave" x="48" y="5"/><use xlink:href="#gentle-wave" x="48" y="7"/></g></svg></div><main><div class="inner"><div id="main" class="pjax"><div class="article wrap"><div class="breadcrumb" itemscope itemtype="https://schema.org/BreadcrumbList"><i class="ic i-home"></i> <span><a href="/">首页</a></span><i class="ic i-angle-right"></i> <span class="current" itemprop="itemListElement" itemscope itemtype="https://schema.org/ListItem"><a href="/categories/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80/" itemprop="item" rel="index" title="分类于 机器学习基础"><span itemprop="name">机器学习基础</span></a><meta itemprop="position" content="1"></span></div><article itemscope itemtype="http://schema.org/Article" class="post block" lang="zh-CN"><link itemprop="mainEntityOfPage" href="https://jiang-hs.gitee.io/posts/202f1f0f/"><span hidden itemprop="author" itemscope itemtype="http://schema.org/Person"><meta itemprop="image" content="/images/avatar.jpg"><meta itemprop="name" content="hang shun"><meta itemprop="description" content="天官赐福，百无禁忌, 世中逢尔，雨中逢花"></span><span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization"><meta itemprop="name" content="航 順"></span><div class="body md" itemprop="articleBody"><h1 id="一-梯度下降"><a class="anchor" href="#一-梯度下降">#</a> 一、梯度下降</h1><h2 id="1梯度下降简介"><a class="anchor" href="#1梯度下降简介">#</a> 1. 梯度下降简介</h2><p>梯度下降法（gradient descent）是一个最优化算法，也称为最速下降法（steepest descent）。常用于机器学习和人工智能当中用来递归性地逼近最小偏差模型。寻找极小点时，从负梯度的方向寻找。</p><h2 id="2核心公式"><a class="anchor" href="#2核心公式">#</a> 2. 核心公式</h2><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><msub><mi>θ</mi><mi>i</mi></msub><mo>=</mo><msub><mi>θ</mi><mi>i</mi></msub><mtext>−</mtext><mi>α</mi><mfrac><mi mathvariant="normal">∂</mi><mrow><mi mathvariant="normal">∂</mi><msub><mi>θ</mi><mi>i</mi></msub></mrow></mfrac><mi>J</mi><mo stretchy="false">(</mo><mi>θ</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">θ_i=θ_i−α\frac{∂}{∂θ_i}J(θ)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.31166399999999994em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:2.20744em;vertical-align:-.8360000000000001em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.31166399999999994em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord">−</span><span class="mord mathnormal" style="margin-right:.0037em">α</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.37144em"><span style="top:-2.3139999999999996em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.31166399999999994em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.8360000000000001em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord mathnormal" style="margin-right:.09618em">J</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="mclose">)</span></span></span></span></span></p><p>其中<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi></mrow><annotation encoding="application/x-tex">α</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal" style="margin-right:.0037em">α</span></span></span></span> 为步长，也叫学习率。</p><h2 id="3梯度下降法的一般步骤"><a class="anchor" href="#3梯度下降法的一般步骤">#</a> 3. 梯度下降法的一般步骤</h2><p>假设函数 $ y = f (x_1,x_2,...,x_n)$ 只有一个极小点。<br>初始给定参数为 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>X</mi><mn>0</mn></msub><mo>=</mo><mo stretchy="false">(</mo><msub><mi>x</mi><mn>1</mn></msub><mn>0</mn><mo separator="true">,</mo><msub><mi>x</mi><mn>2</mn></msub><mn>0</mn><mo separator="true">,</mo><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mo separator="true">,</mo><msub><mi>x</mi><mi>n</mi></msub><mn>0</mn><mo separator="true">,</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">X_0 = (x_10,x_20,...,x_n0,)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.83333em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.07847em">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.07847em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-.25em"></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord">.</span><span class="mord">.</span><span class="mord">.</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.151392em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mclose">)</span></span></span></span>。从这个点如何搜索才能找到原函数的极小值点？<br>方法：<br>①首先设定一个较小的正数<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi></mrow><annotation encoding="application/x-tex">α</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal" style="margin-right:.0037em">α</span></span></span></span>，<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ε</mi></mrow><annotation encoding="application/x-tex">ε</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal">ε</span></span></span></span>；<br>②求当前位置出处的各个偏导数：</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><msup><mi>f</mi><mo mathvariant="normal" lspace="0em" rspace="0em">′</mo></msup><mo stretchy="false">(</mo><msub><mi>x</mi><mrow><mi>m</mi><mn>0</mn></mrow></msub><mo stretchy="false">)</mo><mo>=</mo><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>y</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>x</mi><mi>m</mi></msub></mrow></mfrac><mo stretchy="false">(</mo><msub><mi>x</mi><mrow><mi>m</mi><mn>0</mn></mrow></msub><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>m</mi><mo>=</mo><mn>1</mn><mo separator="true">,</mo><mn>2</mn><mo separator="true">,</mo><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mo separator="true">,</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">f&#x27;(x_{m0})=\frac{∂y}{∂x_{m}}(x_{m0}),m=1,2,...,n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.051892em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.10764em">f</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.801892em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:2.20744em;vertical-align:-.8360000000000001em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714399999999998em"><span style="top:-2.3139999999999996em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.151392em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span><span class="mord mathnormal" style="margin-right:.03588em">y</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.8360000000000001em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord mathnormal">m</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.8388800000000001em;vertical-align:-.19444em"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord">2</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord">.</span><span class="mord">.</span><span class="mord">.</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord mathnormal">n</span></span></span></span></span></p><p>③修改当前函数的参数值，公式如下：</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><msubsup><mi>x</mi><mi>m</mi><mo mathvariant="normal" lspace="0em" rspace="0em">′</mo></msubsup><mo>=</mo><msub><mi>x</mi><mi>m</mi></msub><mo>−</mo><mi>α</mi><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>y</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>x</mi><mi>m</mi></msub></mrow></mfrac><mo stretchy="false">(</mo><msub><mi>x</mi><mrow><mi>m</mi><mn>0</mn></mrow></msub><mo stretchy="false">)</mo><mo separator="true">,</mo><mi>m</mi><mo>=</mo><mn>1</mn><mo separator="true">,</mo><mn>2</mn><mo separator="true">,</mo><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mi mathvariant="normal">.</mi><mo separator="true">,</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">x&#x27;_{m}=x_{m}-α\frac{∂y}{∂x_{m}}(x_{m0}),m=1,2,...,n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.048892em;vertical-align:-.247em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.8018919999999999em"><span style="top:-2.4530000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.247em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.73333em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.151392em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:2.20744em;vertical-align:-.8360000000000001em"></span><span class="mord mathnormal" style="margin-right:.0037em">α</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714399999999998em"><span style="top:-2.3139999999999996em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.151392em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span><span class="mord mathnormal" style="margin-right:.03588em">y</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.8360000000000001em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord mathnormal">m</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.8388800000000001em;vertical-align:-.19444em"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord">2</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord">.</span><span class="mord">.</span><span class="mord">.</span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord mathnormal">n</span></span></span></span></span></p><p>④如果参数变化量小于<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ε</mi></mrow><annotation encoding="application/x-tex">ε</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal">ε</span></span></span></span>，退出；否则返回第 2 步。</p><h2 id="4梯度下降存在的问题"><a class="anchor" href="#4梯度下降存在的问题">#</a> 4. 梯度下降存在的问题：</h2><ul><li>当靠近极小值时收敛速度减慢；</li><li>直线搜索时可能会产生一些问题；</li><li>下降过程可能会出现 “之字形” 地下降。</li></ul><h2 id="5梯度下降三兄弟bgdsgdmbgd"><a class="anchor" href="#5梯度下降三兄弟bgdsgdmbgd">#</a> 5. 梯度下降三兄弟（BGD，SGD，MBGD）</h2><h3 id="1批量梯度下降法batch-gradient-descent"><a class="anchor" href="#1批量梯度下降法batch-gradient-descent">#</a> ①批量梯度下降法（Batch Gradient Descent）</h3><p>批量梯度下降法每次都使用训练集中的所有样本更新参数。它得到的是一个全局最优解，但是每迭代一步，都要用到训练集所有的数据，如果 m 很大，那么迭代速度就会变得很慢。<br>优点：可以得出全局最优解。<br>缺点：样本数据集大时，训练速度慢。</p><h3 id="2随机梯度下降法stochastic-gradient-descent"><a class="anchor" href="#2随机梯度下降法stochastic-gradient-descent">#</a> ②随机梯度下降法（Stochastic Gradient Descent）</h3><p>随机梯度下降法每次更新都从样本随机选择 1 组数据，因此随机梯度下降比批量梯度下降在计算量上会大大减少。SGD 有一个缺点是，其噪音较 BGD 要多，使得 SGD 并不是每次迭代都向着整体最优化方向。而且 SGD 因为每次都是使用一个样本进行迭代，因此最终求得的最优解往往不是全局最优解，而只是局部最优解。但是大的整体的方向是向全局最优解的，最终的结果往往是在全局最优解附近。<br>优点：训练速度较快。<br>缺点：过程杂乱，准确度下降。</p><h3 id="3小批量梯度下降法mini-batch-gradient-descent"><a class="anchor" href="#3小批量梯度下降法mini-batch-gradient-descent">#</a> ③小批量梯度下降法（Mini-batch Gradient Descent）</h3><p>小批量梯度下降法对包含 n 个样本的数据集进行计算。综合了上述两种方法，既保证了训练速度快，又保证了准确度。</p><h1 id="二-线性回归"><a class="anchor" href="#二-线性回归">#</a> 二、线性回归</h1><h2 id="1线性回归简介"><a class="anchor" href="#1线性回归简介">#</a> 1. 线性回归简介</h2><p>线性回归是利用数理统计中的回归分析，来确定两种或两种以上变量间相互依赖的定量关系的一种统计分析方法，运用十分广泛。<br>回归分析中，如果只包括一个自变量和一个因变量，且二者的关系可用一条直线近似表示，这种回归分析称为<strong>一元线性回归分析</strong>；如果回归分析中包括两个或两个以上的自变量，且因变量和自变量之间是线性关系，则称为<strong>多元线性回归分析</strong>。</p><h2 id="2线性回归分析的一般形式"><a class="anchor" href="#2线性回归分析的一般形式">#</a> 2. 线性回归分析的一般形式</h2><p x="">一元线性回归分析：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Y</mi><mo>=</mo><mi>a</mi><mo>+</mo><mi>b</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">Y=a+bx</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.68333em;vertical-align:0"></span><span class="mord mathnormal" style="margin-right:.22222em">Y</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.66666em;vertical-align:-.08333em"></span><span class="mord mathnormal">a</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.69444em;vertical-align:0"></span><span class="mord mathnormal">b</span><span class="mord mathnormal">x</span></span></span></span><br>多元线性回归分析：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Y</mi><mo>=</mo><mi>a</mi><mo>+</mo><msub><mi>b</mi><mn>1</mn></msub><msub><mi>X</mi><mn>1</mn></msub><mo>+</mo><msub><mi>b</mi><mn>2</mn></msub><msub><mi>X</mi><mn>2</mn></msub><mo>+</mo><mo separator="true">⋅</mo><mo separator="true">⋅</mo><mo separator="true">⋅</mo><mo>+</mo><msub><mi>b</mi><mi>k</mi></msub><msub><mi>X</mi><mi>k</mi></msub></mrow><annotation encoding="application/x-tex">Y=a+b_{1}X_{1}+b_{2}X_{2}+···+b_{k}X_{k}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.68333em;vertical-align:0"></span><span class="mord mathnormal" style="margin-right:.22222em">Y</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.66666em;vertical-align:-.08333em"></span><span class="mord mathnormal">a</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:.07847em">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.07847em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:.07847em">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.07847em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord">+</span><span class="mpunct">⋅</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mpunct">⋅</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mpunct">⋅</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord">+</span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.03148em">k</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:.07847em">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:-.07847em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.03148em">k</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span></span><br>渐进回归模型：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Y</mi><mo>=</mo><mi>a</mi><mo>+</mo><mi>b</mi><msup><mi>e</mi><mrow><mo>−</mo><mi>r</mi><mi>X</mi></mrow></msup></mrow><annotation encoding="application/x-tex">Y=a+be^{-rX}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.68333em;vertical-align:0"></span><span class="mord mathnormal" style="margin-right:.22222em">Y</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.66666em;vertical-align:-.08333em"></span><span class="mord mathnormal">a</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.8413309999999999em;vertical-align:0"></span><span class="mord mathnormal">b</span><span class="mord"><span class="mord mathnormal">e</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.8413309999999999em"><span style="top:-3.063em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">−</span><span class="mord mathnormal mtight" style="margin-right:.02778em">r</span><span class="mord mathnormal mtight" style="margin-right:.07847em">X</span></span></span></span></span></span></span></span></span></span></span></span><br>二次曲线回归模型：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Y</mi><mo>=</mo><mi>a</mi><mo>+</mo><msub><mi>b</mi><mn>1</mn></msub><mi>X</mi><mo>+</mo><msub><mi>b</mi><mn>2</mn></msub><msup><mi>X</mi><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">Y=a+b_{1}X+b_{2}X^{2}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.68333em;vertical-align:0"></span><span class="mord mathnormal" style="margin-right:.22222em">Y</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.66666em;vertical-align:-.08333em"></span><span class="mord mathnormal">a</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:.07847em">X</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.964108em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:.07847em">X</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.8141079999999999em"><span style="top:-3.063em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span></span></span></span><br>双曲线模型：Y=a+\frac{b}</p><p>除此之外还有其他的非线性回归分析的一般形式，这里不再赘述。</p><h2 id="3回归分析可以解决的问题"><a class="anchor" href="#3回归分析可以解决的问题">#</a> 3. 回归分析可以解决的问题</h2><ul><li>建立变量间的数学表达式，通常称为经验公式。</li><li>利用概率统计基础知识进行分析，从而判断所建立的经验公式的有效性。</li><li>进行因素分析，确定影响某一变量的若干变量（因素）中，何者为主要，何者为次要，以及它们之间的关系（比如：房屋价格的预测）。</li></ul><h1 id="三-一元线性回归函数推导过程"><a class="anchor" href="#三-一元线性回归函数推导过程">#</a> 三、一元线性回归函数推导过程</h1><p>设线性回归函数：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>h</mi><mi>θ</mi></msub><mo stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><msub><mi>θ</mi><mn>0</mn></msub><mo>+</mo><msub><mi>θ</mi><mn>1</mn></msub><mi>x</mi></mrow><annotation encoding="application/x-tex">h_θ(x)=θ_0+θ_1x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:.02778em">θ</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose">)</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mord mathnormal">x</span></span></span></span><br>构造损失函数（loss）：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>J</mi><mo stretchy="false">(</mo><msub><mi>θ</mi><mn>0</mn></msub><mo separator="true">,</mo><msub><mi>θ</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mrow><mn>2</mn><mi>m</mi></mrow></mfrac><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></msubsup><mo stretchy="false">(</mo><msub><mi>h</mi><mi>θ</mi></msub><mo stretchy="false">(</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo>−</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><msup><mo stretchy="false">)</mo><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">J(θ_{0},θ_{1})=\frac{1}{2m}\sum_{i=1}^{m}(h_{θ}(x^{(i)})-y^{(i)})^{2}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-.25em"></span><span class="mord mathnormal" style="margin-right:.09618em">J</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:1.2329999999999999em;vertical-align:-.345em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.845108em"><span style="top:-2.6550000000000002em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span><span class="mord mathnormal mtight">m</span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.394em"><span class="pstrut" style="height:3em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.345em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:-.0000050000000000050004em">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.804292em"><span style="top:-2.40029em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.2029em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.29971000000000003em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.02778em">θ</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.8879999999999999em"><span style="top:-3.063em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:1.138em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.8879999999999999em"><span style="top:-3.063em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.8141079999999999em"><span style="top:-3.063em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span></span></span></span><br>思路：通过梯度下降法不断更新<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>θ</mi><mn>0</mn></msub></mrow><annotation encoding="application/x-tex">θ_{0}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span></span> 和<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>θ</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">θ_{1}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span></span>​，当损失函数的值特别小时，就得到了我们最终的函数模型。<br>过程：</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mi>J</mi><mo stretchy="false">(</mo><msub><mi>θ</mi><mn>0</mn></msub><mo separator="true">,</mo><msub><mi>θ</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mrow><mn>2</mn><mi>m</mi></mrow></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><mo stretchy="false">(</mo><msub><mi>h</mi><mi>θ</mi></msub><mo stretchy="false">(</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo>−</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><msup><mo stretchy="false">)</mo><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">J(θ_{0},θ_{1})=\frac{1}{2m}\sum_{i=1}^{m}(h_{θ}(x^{(i)})-y^{(i)})^{2}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-.25em"></span><span class="mord mathnormal" style="margin-right:.09618em">J</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:2.929066em;vertical-align:-1.277669em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.32144em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">2</span><span class="mord mathnormal">m</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6513970000000002em"><span style="top:-1.872331em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.050005em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3000050000000005em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.277669em"><span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.02778em">θ</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.8641079999999999em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span></span></span></span></span></p><p>step1. 求导：</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mfrac><mi mathvariant="normal">∂</mi><mrow><mi mathvariant="normal">∂</mi><msub><mi>θ</mi><mn>0</mn></msub></mrow></mfrac><mi>J</mi><mo stretchy="false">(</mo><msub><mi>θ</mi><mn>0</mn></msub><mo separator="true">,</mo><msub><mi>θ</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mi>m</mi></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><mo stretchy="false">(</mo><msub><mi>h</mi><mi>θ</mi></msub><mo stretchy="false">(</mo><mi>x</mi><msup><mo stretchy="false">)</mo><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo>−</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\frac{∂}{∂θ_{0}}J(θ_{0},θ_{1})=\frac{1}{m}\sum_{i=1} ^{m}(h_{θ}(x)^{(i)}-y^{(i)})</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.20744em;vertical-align:-.8360000000000001em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.37144em"><span style="top:-2.3139999999999996em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.8360000000000001em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord mathnormal" style="margin-right:.09618em">J</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:2.929066em;vertical-align:-1.277669em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.32144em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">m</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6513970000000002em"><span style="top:-1.872331em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.050005em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3000050000000005em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.277669em"><span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.02778em">θ</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mfrac><mi mathvariant="normal">∂</mi><mrow><mi mathvariant="normal">∂</mi><msub><mi>θ</mi><mn>1</mn></msub></mrow></mfrac><mi>J</mi><mo stretchy="false">(</mo><msub><mi>θ</mi><mn>0</mn></msub><mo separator="true">,</mo><msub><mi>θ</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mi>m</mi></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><mo stretchy="false">(</mo><msub><mi>h</mi><mi>θ</mi></msub><mo stretchy="false">(</mo><mi>x</mi><msup><mo stretchy="false">)</mo><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo>−</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo separator="true">⋅</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">\frac{∂}{∂θ_{1}}J(θ_{0},θ_{1})=\frac{1}{m}\sum_{i=1} ^{m}(h_{θ}(x)^{(i)}-y^{(i)})·x^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.20744em;vertical-align:-.8360000000000001em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.37144em"><span style="top:-2.3139999999999996em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord" style="margin-right:.05556em">∂</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.8360000000000001em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord mathnormal" style="margin-right:.09618em">J</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:2.929066em;vertical-align:-1.277669em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.32144em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">m</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6513970000000002em"><span style="top:-1.872331em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.050005em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3000050000000005em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.277669em"><span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.02778em">θ</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mpunct">⋅</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span></span></p><p>step2. 更新<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>θ</mi><mn>0</mn></msub></mrow><annotation encoding="application/x-tex">θ_{0}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span></span> 和<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>θ</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">θ_{1}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span></span></span></span>：</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><msub><mi>θ</mi><mn>0</mn></msub><mo>=</mo><msub><mi>θ</mi><mn>0</mn></msub><mo>−</mo><mi>α</mi><mo separator="true">⋅</mo><mfrac><mn>1</mn><mi>m</mi></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><mo stretchy="false">(</mo><msub><mi>h</mi><mi>θ</mi></msub><mo stretchy="false">(</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo>−</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">θ_{0}=θ_{0}-α·\frac{1}{m}\sum_{i=1}^{m}(h_{θ}(x^{(i)})-y^{(i)})</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">0</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:2.929066em;vertical-align:-1.277669em"></span><span class="mord mathnormal" style="margin-right:.0037em">α</span><span class="mpunct">⋅</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.32144em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">m</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6513970000000002em"><span style="top:-1.872331em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.050005em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3000050000000005em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.277669em"><span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.02778em">θ</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><msub><mi>θ</mi><mn>1</mn></msub><mo>=</mo><msub><mi>θ</mi><mn>1</mn></msub><mo>−</mo><mi>α</mi><mo separator="true">⋅</mo><mfrac><mn>1</mn><mi>m</mi></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><mo stretchy="false">(</mo><msub><mi>h</mi><mi>θ</mi></msub><mo stretchy="false">(</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo>−</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo separator="true">⋅</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">θ_{1}=θ_{1}-α·\frac{1}{m}\sum_{i=1}^{m}(h_{θ}(x^{(i)})-y^{(i)})·x^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.84444em;vertical-align:-.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.02778em">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.30110799999999993em"><span style="top:-2.5500000000000003em;margin-left:-.02778em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:2.929066em;vertical-align:-1.277669em"></span><span class="mord mathnormal" style="margin-right:.0037em">α</span><span class="mpunct">⋅</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.32144em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">m</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6513970000000002em"><span style="top:-1.872331em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.050005em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3000050000000005em;margin-left:0"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.277669em"><span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">h</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:.33610799999999996em"><span style="top:-2.5500000000000003em;margin-left:0;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:.02778em">θ</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:.15em"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-.25em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mpunct">⋅</span><span class="mspace" style="margin-right:.16666666666666666em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:.938em"><span style="top:-3.113em;margin-right:.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span></span></p><p>step3. 代入损失函数，求损失函数的值，若得到的值小于<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ε</mi></mrow><annotation encoding="application/x-tex">ε</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal">ε</span></span></span></span>（一般为 0.01 或者 0.001 这样的小数），退出；否则，返回 step1。</p><h1 id="四-代码实现"><a class="anchor" href="#四-代码实现">#</a> 四、代码实现</h1><p>以下为通过梯度下降法实现一元线性回归的代码，分别使用了上述的 3 种梯度下降法：</p><h2 id="1批量梯度下降"><a class="anchor" href="#1批量梯度下降">#</a> 1. 批量梯度下降</h2><figure class="highlight python"><figcaption data-lang="python"></figcaption><table><tr><td data-num="1"></td><td><pre><span class="token keyword">import</span> matplotlib<span class="token punctuation">.</span>pyplot <span class="token keyword">as</span> plt</pre></td></tr><tr><td data-num="2"></td><td><pre><span class="token keyword">import</span> matplotlib</pre></td></tr><tr><td data-num="3"></td><td><pre><span class="token keyword">from</span> math <span class="token keyword">import</span> <span class="token builtin">pow</span></pre></td></tr><tr><td data-num="4"></td><td><pre><span class="token keyword">from</span> random <span class="token keyword">import</span> uniform</pre></td></tr><tr><td data-num="5"></td><td><pre></pre></td></tr><tr><td data-num="6"></td><td><pre>x <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">1</span><span class="token punctuation">,</span><span class="token number">3</span><span class="token punctuation">,</span><span class="token number">5</span><span class="token punctuation">,</span><span class="token number">7</span><span class="token punctuation">,</span><span class="token number">9</span><span class="token punctuation">,</span><span class="token number">11</span><span class="token punctuation">,</span><span class="token number">13</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="7"></td><td><pre>y <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">100</span><span class="token punctuation">,</span><span class="token number">111</span><span class="token punctuation">,</span><span class="token number">130</span><span class="token punctuation">,</span><span class="token number">144</span><span class="token punctuation">,</span><span class="token number">149</span><span class="token punctuation">,</span><span class="token number">166</span><span class="token punctuation">,</span><span class="token number">179</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="8"></td><td><pre></pre></td></tr><tr><td data-num="9"></td><td><pre><span class="token comment">#线性回归函数为 y=theta0+theta1*x</span></pre></td></tr><tr><td data-num="10"></td><td><pre><span class="token comment">#参数定义</span></pre></td></tr><tr><td data-num="11"></td><td><pre>theta0 <span class="token operator">=</span> uniform<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span><span class="token comment">#对 theata0 随机赋值</span></pre></td></tr><tr><td data-num="12"></td><td><pre>theta1 <span class="token operator">=</span> uniform<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span><span class="token comment">#对 theata1 随机赋值</span></pre></td></tr><tr><td data-num="13"></td><td><pre>alpha <span class="token operator">=</span> <span class="token number">0.1</span><span class="token comment">#学习率</span></pre></td></tr><tr><td data-num="14"></td><td><pre>m <span class="token operator">=</span> <span class="token builtin">len</span><span class="token punctuation">(</span>x<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="15"></td><td><pre>count <span class="token operator">=</span> <span class="token number">0</span></pre></td></tr><tr><td data-num="16"></td><td><pre>loss <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="17"></td><td><pre></pre></td></tr><tr><td data-num="18"></td><td><pre><span class="token keyword">for</span> num <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span><span class="token number">10000</span><span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="19"></td><td><pre>    count <span class="token operator">+=</span> <span class="token number">1</span></pre></td></tr><tr><td data-num="20"></td><td><pre>    diss <span class="token operator">=</span> <span class="token number">0</span>   <span class="token comment">#误差</span></pre></td></tr><tr><td data-num="21"></td><td><pre>    deriv0 <span class="token operator">=</span> <span class="token number">0</span> <span class="token comment">#对 theata0 导数</span></pre></td></tr><tr><td data-num="22"></td><td><pre>    deriv1 <span class="token operator">=</span> <span class="token number">0</span> <span class="token comment">#对 theata1 导数</span></pre></td></tr><tr><td data-num="23"></td><td><pre>    <span class="token comment">#求导</span></pre></td></tr><tr><td data-num="24"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>m<span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="25"></td><td><pre>        deriv0 <span class="token operator">+=</span> <span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m</pre></td></tr><tr><td data-num="26"></td><td><pre>        deriv1 <span class="token operator">+=</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span><span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span></pre></td></tr><tr><td data-num="27"></td><td><pre>    <span class="token comment">#更新 theta0 和 theta1</span></pre></td></tr><tr><td data-num="28"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>m<span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="29"></td><td><pre>        theta0 <span class="token operator">=</span> theta0 <span class="token operator">-</span> alpha<span class="token operator">*</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span> </pre></td></tr><tr><td data-num="30"></td><td><pre>        theta1 <span class="token operator">=</span> theta1 <span class="token operator">-</span> alpha<span class="token operator">*</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span><span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span></pre></td></tr><tr><td data-num="31"></td><td><pre>    <span class="token comment">#求损失函数 J (θ)</span></pre></td></tr><tr><td data-num="32"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>m<span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="33"></td><td><pre>        diss <span class="token operator">=</span> diss <span class="token operator">+</span> <span class="token punctuation">(</span><span class="token number">1</span><span class="token operator">/</span><span class="token punctuation">(</span><span class="token number">2</span><span class="token operator">*</span>m<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">*</span><span class="token builtin">pow</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="34"></td><td><pre>    loss<span class="token punctuation">.</span>append<span class="token punctuation">(</span>diss<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="35"></td><td><pre></pre></td></tr><tr><td data-num="36"></td><td><pre>    <span class="token comment">#如果误差已经很小，则退出循环</span></pre></td></tr><tr><td data-num="37"></td><td><pre>    <span class="token keyword">if</span> diss <span class="token operator">&lt;=</span> <span class="token number">0.001</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="38"></td><td><pre>        <span class="token keyword">break</span></pre></td></tr><tr><td data-num="39"></td><td><pre>    </pre></td></tr><tr><td data-num="40"></td><td><pre><span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">"本次迭代次数为：&#123;&#125;次，最终得到theta0=&#123;&#125;，theta1=&#123;&#125;"</span><span class="token punctuation">.</span><span class="token builtin">format</span><span class="token punctuation">(</span>count<span class="token punctuation">,</span>theta0<span class="token punctuation">,</span>theta1<span class="token punctuation">)</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="41"></td><td><pre><span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">"本次迭代得到的回归函数是：y=&#123;&#125;+&#123;&#125;*x"</span><span class="token punctuation">.</span><span class="token builtin">format</span><span class="token punctuation">(</span>theta0<span class="token punctuation">,</span>theta1<span class="token punctuation">)</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="42"></td><td><pre><span class="token comment">#画原始数据图和得到的线性回归函数图</span></pre></td></tr><tr><td data-num="43"></td><td><pre>matplotlib<span class="token punctuation">.</span>rcParams<span class="token punctuation">[</span><span class="token string">'font.sans-serif'</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token string">'SimHei'</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="44"></td><td><pre>plt<span class="token punctuation">.</span>plot<span class="token punctuation">(</span>x<span class="token punctuation">,</span>y<span class="token punctuation">,</span><span class="token string">'bo'</span><span class="token punctuation">,</span>label<span class="token operator">=</span><span class="token string">'数据'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="45"></td><td><pre>plt<span class="token punctuation">.</span>plot<span class="token punctuation">(</span>x<span class="token punctuation">,</span><span class="token punctuation">[</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x <span class="token keyword">for</span> x <span class="token keyword">in</span> x<span class="token punctuation">]</span><span class="token punctuation">,</span>label<span class="token operator">=</span><span class="token string">'线性回归函数'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="46"></td><td><pre>plt<span class="token punctuation">.</span>xlabel<span class="token punctuation">(</span><span class="token string">'x'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="47"></td><td><pre>plt<span class="token punctuation">.</span>ylabel<span class="token punctuation">(</span><span class="token string">'y'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="48"></td><td><pre>plt<span class="token punctuation">.</span>legend<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="49"></td><td><pre>plt<span class="token punctuation">.</span>show<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="50"></td><td><pre><span class="token comment">#画损失函数（误差）变化图</span></pre></td></tr><tr><td data-num="51"></td><td><pre>plt<span class="token punctuation">.</span>scatter<span class="token punctuation">(</span><span class="token builtin">range</span><span class="token punctuation">(</span>count<span class="token punctuation">)</span><span class="token punctuation">,</span>loss<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="52"></td><td><pre>plt<span class="token punctuation">.</span>show<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr></table></figure><p>输出结果：<br>本次迭代次数为：10000 次，最终得到<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mi>h</mi><mi>e</mi><mi>t</mi><mi>a</mi><mn>0</mn><mo>=</mo><mn>96.86808644268262</mn></mrow><annotation encoding="application/x-tex">theta0=96.86808644268262</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.69444em;vertical-align:0"></span><span class="mord mathnormal">t</span><span class="mord mathnormal">h</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span><span class="mord">0</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">9</span><span class="mord">6</span><span class="mord">.</span><span class="mord">8</span><span class="mord">6</span><span class="mord">8</span><span class="mord">0</span><span class="mord">8</span><span class="mord">6</span><span class="mord">4</span><span class="mord">4</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">2</span><span class="mord">6</span><span class="mord">2</span></span></span></span>，<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mi>h</mi><mi>e</mi><mi>t</mi><mi>a</mi><mn>1</mn><mo>=</mo><mn>5.762687172041142</mn></mrow><annotation encoding="application/x-tex">theta1=5.762687172041142</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.69444em;vertical-align:0"></span><span class="mord mathnormal">t</span><span class="mord mathnormal">h</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span><span class="mord">1</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">5</span><span class="mord">.</span><span class="mord">7</span><span class="mord">6</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">7</span><span class="mord">1</span><span class="mord">7</span><span class="mord">2</span><span class="mord">0</span><span class="mord">4</span><span class="mord">1</span><span class="mord">1</span><span class="mord">4</span><span class="mord">2</span></span></span></span><br>本次迭代得到的回归函数是：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi><mo>=</mo><mn>96.86808644268262</mn><mo>+</mo><mn>5.762687172041142</mn><mo>∗</mo><mi>x</mi></mrow><annotation encoding="application/x-tex">y=96.86808644268262+5.762687172041142*x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.625em;vertical-align:-.19444em"></span><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.72777em;vertical-align:-.08333em"></span><span class="mord">9</span><span class="mord">6</span><span class="mord">.</span><span class="mord">8</span><span class="mord">6</span><span class="mord">8</span><span class="mord">0</span><span class="mord">8</span><span class="mord">6</span><span class="mord">4</span><span class="mord">4</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">2</span><span class="mord">6</span><span class="mord">2</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">5</span><span class="mord">.</span><span class="mord">7</span><span class="mord">6</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">7</span><span class="mord">1</span><span class="mord">7</span><span class="mord">2</span><span class="mord">0</span><span class="mord">4</span><span class="mord">1</span><span class="mord">1</span><span class="mord">4</span><span class="mord">2</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">∗</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal">x</span></span></span></span><br><img data-src="https://jiang-hs.github.io/post-images/1590134245318.png" alt="img"></p><h2 id="2随机梯度下降"><a class="anchor" href="#2随机梯度下降">#</a> 2. 随机梯度下降</h2><figure class="highlight python"><figcaption data-lang="python"></figcaption><table><tr><td data-num="1"></td><td><pre><span class="token keyword">import</span> matplotlib<span class="token punctuation">.</span>pyplot <span class="token keyword">as</span> plt</pre></td></tr><tr><td data-num="2"></td><td><pre><span class="token keyword">import</span> matplotlib</pre></td></tr><tr><td data-num="3"></td><td><pre><span class="token keyword">from</span> math <span class="token keyword">import</span> <span class="token builtin">pow</span></pre></td></tr><tr><td data-num="4"></td><td><pre><span class="token keyword">import</span> random</pre></td></tr><tr><td data-num="5"></td><td><pre></pre></td></tr><tr><td data-num="6"></td><td><pre>x <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">1</span><span class="token punctuation">,</span><span class="token number">3</span><span class="token punctuation">,</span><span class="token number">5</span><span class="token punctuation">,</span><span class="token number">7</span><span class="token punctuation">,</span><span class="token number">9</span><span class="token punctuation">,</span><span class="token number">11</span><span class="token punctuation">,</span><span class="token number">13</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="7"></td><td><pre>y <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">100</span><span class="token punctuation">,</span><span class="token number">111</span><span class="token punctuation">,</span><span class="token number">130</span><span class="token punctuation">,</span><span class="token number">144</span><span class="token punctuation">,</span><span class="token number">149</span><span class="token punctuation">,</span><span class="token number">166</span><span class="token punctuation">,</span><span class="token number">179</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="8"></td><td><pre></pre></td></tr><tr><td data-num="9"></td><td><pre><span class="token comment">#线性回归函数为 y=theta0+theta1*x</span></pre></td></tr><tr><td data-num="10"></td><td><pre><span class="token comment">#参数定义</span></pre></td></tr><tr><td data-num="11"></td><td><pre>theta0 <span class="token operator">=</span> random<span class="token punctuation">.</span>uniform<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span><span class="token comment">#对 theata0 随机赋值</span></pre></td></tr><tr><td data-num="12"></td><td><pre>theta1 <span class="token operator">=</span> random<span class="token punctuation">.</span>uniform<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span><span class="token comment">#对 theata1 随机赋值</span></pre></td></tr><tr><td data-num="13"></td><td><pre>alpha <span class="token operator">=</span> <span class="token number">0.1</span><span class="token comment">#学习率</span></pre></td></tr><tr><td data-num="14"></td><td><pre>m <span class="token operator">=</span> <span class="token builtin">len</span><span class="token punctuation">(</span>x<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="15"></td><td><pre>count <span class="token operator">=</span> <span class="token number">0</span></pre></td></tr><tr><td data-num="16"></td><td><pre>loss <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="17"></td><td><pre></pre></td></tr><tr><td data-num="18"></td><td><pre><span class="token keyword">for</span> num <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span><span class="token number">10000</span><span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="19"></td><td><pre>    count <span class="token operator">+=</span> <span class="token number">1</span></pre></td></tr><tr><td data-num="20"></td><td><pre>    diss <span class="token operator">=</span> <span class="token number">0</span>   <span class="token comment">#误差</span></pre></td></tr><tr><td data-num="21"></td><td><pre>    deriv0 <span class="token operator">=</span> <span class="token number">0</span> <span class="token comment">#对 theata0 导数</span></pre></td></tr><tr><td data-num="22"></td><td><pre>    deriv1 <span class="token operator">=</span> <span class="token number">0</span> <span class="token comment">#对 theata1 导数</span></pre></td></tr><tr><td data-num="23"></td><td><pre>    <span class="token comment">#求导</span></pre></td></tr><tr><td data-num="24"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>m<span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="25"></td><td><pre>        deriv0 <span class="token operator">+=</span> <span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m</pre></td></tr><tr><td data-num="26"></td><td><pre>        deriv1 <span class="token operator">+=</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span><span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span></pre></td></tr><tr><td data-num="27"></td><td><pre>    <span class="token comment">#更新 theta0 和 theta1</span></pre></td></tr><tr><td data-num="28"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>m<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="29"></td><td><pre>        theta0 <span class="token operator">=</span> theta0 <span class="token operator">-</span> alpha<span class="token operator">*</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span> </pre></td></tr><tr><td data-num="30"></td><td><pre>        theta1 <span class="token operator">=</span> theta1 <span class="token operator">-</span> alpha<span class="token operator">*</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span><span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span></pre></td></tr><tr><td data-num="31"></td><td><pre>    <span class="token comment">#求损失函数 J (θ)：</span></pre></td></tr><tr><td data-num="32"></td><td><pre>    rand_i <span class="token operator">=</span> random<span class="token punctuation">.</span>randrange<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span>m<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="33"></td><td><pre>    diss <span class="token operator">=</span> diss <span class="token operator">+</span> <span class="token punctuation">(</span><span class="token number">1</span><span class="token operator">/</span><span class="token punctuation">(</span><span class="token number">2</span><span class="token operator">*</span>m<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">*</span><span class="token builtin">pow</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>rand_i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>rand_i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="34"></td><td><pre>    loss<span class="token punctuation">.</span>append<span class="token punctuation">(</span>diss<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="35"></td><td><pre></pre></td></tr><tr><td data-num="36"></td><td><pre>    <span class="token comment">#如果误差已经很小，则退出循环</span></pre></td></tr><tr><td data-num="37"></td><td><pre>    <span class="token keyword">if</span> diss <span class="token operator">&lt;=</span> <span class="token number">0.001</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="38"></td><td><pre>        <span class="token keyword">break</span></pre></td></tr><tr><td data-num="39"></td><td><pre>    </pre></td></tr><tr><td data-num="40"></td><td><pre><span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">"本次迭代次数为：&#123;&#125;次，最终得到theta0=&#123;&#125;，theta1=&#123;&#125;"</span><span class="token punctuation">.</span><span class="token builtin">format</span><span class="token punctuation">(</span>count<span class="token punctuation">,</span>theta0<span class="token punctuation">,</span>theta1<span class="token punctuation">)</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="41"></td><td><pre><span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">"本次迭代得到的回归函数是：y=&#123;&#125;+&#123;&#125;*x"</span><span class="token punctuation">.</span><span class="token builtin">format</span><span class="token punctuation">(</span>theta0<span class="token punctuation">,</span>theta1<span class="token punctuation">)</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="42"></td><td><pre><span class="token comment">#画原始数据图和得到的线性回归函数图</span></pre></td></tr><tr><td data-num="43"></td><td><pre>matplotlib<span class="token punctuation">.</span>rcParams<span class="token punctuation">[</span><span class="token string">'font.sans-serif'</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token string">'SimHei'</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="44"></td><td><pre>plt<span class="token punctuation">.</span>plot<span class="token punctuation">(</span>x<span class="token punctuation">,</span>y<span class="token punctuation">,</span><span class="token string">'bo'</span><span class="token punctuation">,</span>label<span class="token operator">=</span><span class="token string">'数据'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="45"></td><td><pre>plt<span class="token punctuation">.</span>plot<span class="token punctuation">(</span>x<span class="token punctuation">,</span><span class="token punctuation">[</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x <span class="token keyword">for</span> x <span class="token keyword">in</span> x<span class="token punctuation">]</span><span class="token punctuation">,</span>label<span class="token operator">=</span><span class="token string">'线性回归函数'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="46"></td><td><pre>plt<span class="token punctuation">.</span>xlabel<span class="token punctuation">(</span><span class="token string">'x'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="47"></td><td><pre>plt<span class="token punctuation">.</span>ylabel<span class="token punctuation">(</span><span class="token string">'y'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="48"></td><td><pre>plt<span class="token punctuation">.</span>legend<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="49"></td><td><pre>plt<span class="token punctuation">.</span>show<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="50"></td><td><pre><span class="token comment">#画损失函数（误差）变化图</span></pre></td></tr><tr><td data-num="51"></td><td><pre>plt<span class="token punctuation">.</span>scatter<span class="token punctuation">(</span><span class="token builtin">range</span><span class="token punctuation">(</span>count<span class="token punctuation">)</span><span class="token punctuation">,</span>loss<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="52"></td><td><pre>plt<span class="token punctuation">.</span>show<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr></table></figure><p><strong>注意</strong>：因为是随机的，所以每次的结果会不一样。<br>输出结果：<br>本次迭代次数为：159 次，最终得到<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mi>h</mi><mi>e</mi><mi>t</mi><mi>a</mi><mn>0</mn><mo>=</mo><mn>94.02334615793364</mn></mrow><annotation encoding="application/x-tex">theta0=94.02334615793364</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.69444em;vertical-align:0"></span><span class="mord mathnormal">t</span><span class="mord mathnormal">h</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span><span class="mord">0</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">9</span><span class="mord">4</span><span class="mord">.</span><span class="mord">0</span><span class="mord">2</span><span class="mord">3</span><span class="mord">3</span><span class="mord">4</span><span class="mord">6</span><span class="mord">1</span><span class="mord">5</span><span class="mord">7</span><span class="mord">9</span><span class="mord">3</span><span class="mord">3</span><span class="mord">6</span><span class="mord">4</span></span></span></span>，<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mi>h</mi><mi>e</mi><mi>t</mi><mi>a</mi><mn>1</mn><mo>=</mo><mn>5.968813991236714</mn></mrow><annotation encoding="application/x-tex">theta1=5.968813991236714</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.69444em;vertical-align:0"></span><span class="mord mathnormal">t</span><span class="mord mathnormal">h</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span><span class="mord">1</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">5</span><span class="mord">.</span><span class="mord">9</span><span class="mord">6</span><span class="mord">8</span><span class="mord">8</span><span class="mord">1</span><span class="mord">3</span><span class="mord">9</span><span class="mord">9</span><span class="mord">1</span><span class="mord">2</span><span class="mord">3</span><span class="mord">6</span><span class="mord">7</span><span class="mord">1</span><span class="mord">4</span></span></span></span><br>本次迭代得到的回归函数是：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi><mo>=</mo><mn>94.02334615793364</mn><mo>+</mo><mn>5.968813991236714</mn><mo>∗</mo><mi>x</mi></mrow><annotation encoding="application/x-tex">y=94.02334615793364+5.968813991236714*x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.625em;vertical-align:-.19444em"></span><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.72777em;vertical-align:-.08333em"></span><span class="mord">9</span><span class="mord">4</span><span class="mord">.</span><span class="mord">0</span><span class="mord">2</span><span class="mord">3</span><span class="mord">3</span><span class="mord">4</span><span class="mord">6</span><span class="mord">1</span><span class="mord">5</span><span class="mord">7</span><span class="mord">9</span><span class="mord">3</span><span class="mord">3</span><span class="mord">6</span><span class="mord">4</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">5</span><span class="mord">.</span><span class="mord">9</span><span class="mord">6</span><span class="mord">8</span><span class="mord">8</span><span class="mord">1</span><span class="mord">3</span><span class="mord">9</span><span class="mord">9</span><span class="mord">1</span><span class="mord">2</span><span class="mord">3</span><span class="mord">6</span><span class="mord">7</span><span class="mord">1</span><span class="mord">4</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">∗</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal">x</span></span></span></span><br><img data-src="https://jiang-hs.github.io/post-images/1590134254515.png" alt="img"><br><img data-src="https://jiang-hs.github.io/post-images/1590134259162.png" alt="img"></p><h2 id="3小批量梯度下降"><a class="anchor" href="#3小批量梯度下降">#</a> 3. 小批量梯度下降</h2><figure class="highlight python"><figcaption data-lang="python"></figcaption><table><tr><td data-num="1"></td><td><pre><span class="token keyword">import</span> matplotlib<span class="token punctuation">.</span>pyplot <span class="token keyword">as</span> plt</pre></td></tr><tr><td data-num="2"></td><td><pre><span class="token keyword">import</span> matplotlib</pre></td></tr><tr><td data-num="3"></td><td><pre><span class="token keyword">from</span> math <span class="token keyword">import</span> <span class="token builtin">pow</span></pre></td></tr><tr><td data-num="4"></td><td><pre><span class="token keyword">import</span> random</pre></td></tr><tr><td data-num="5"></td><td><pre></pre></td></tr><tr><td data-num="6"></td><td><pre>x <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">1</span><span class="token punctuation">,</span><span class="token number">3</span><span class="token punctuation">,</span><span class="token number">5</span><span class="token punctuation">,</span><span class="token number">7</span><span class="token punctuation">,</span><span class="token number">9</span><span class="token punctuation">,</span><span class="token number">11</span><span class="token punctuation">,</span><span class="token number">13</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="7"></td><td><pre>y <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">100</span><span class="token punctuation">,</span><span class="token number">111</span><span class="token punctuation">,</span><span class="token number">130</span><span class="token punctuation">,</span><span class="token number">144</span><span class="token punctuation">,</span><span class="token number">149</span><span class="token punctuation">,</span><span class="token number">166</span><span class="token punctuation">,</span><span class="token number">179</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="8"></td><td><pre></pre></td></tr><tr><td data-num="9"></td><td><pre><span class="token comment">#目标函数为 y=theta0+theta1*x</span></pre></td></tr><tr><td data-num="10"></td><td><pre><span class="token comment">#参数定义</span></pre></td></tr><tr><td data-num="11"></td><td><pre>theta0 <span class="token operator">=</span> random<span class="token punctuation">.</span>uniform<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span><span class="token comment">#对 theata0 随机赋值</span></pre></td></tr><tr><td data-num="12"></td><td><pre>theta1 <span class="token operator">=</span> random<span class="token punctuation">.</span>uniform<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span><span class="token comment">#对 theata1 随机赋值</span></pre></td></tr><tr><td data-num="13"></td><td><pre>alpha <span class="token operator">=</span> <span class="token number">0.1</span><span class="token comment">#学习率</span></pre></td></tr><tr><td data-num="14"></td><td><pre>m <span class="token operator">=</span> <span class="token builtin">len</span><span class="token punctuation">(</span>x<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="15"></td><td><pre>count <span class="token operator">=</span> <span class="token number">0</span></pre></td></tr><tr><td data-num="16"></td><td><pre>loss <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="17"></td><td><pre></pre></td></tr><tr><td data-num="18"></td><td><pre><span class="token keyword">for</span> num <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span><span class="token number">10000</span><span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="19"></td><td><pre>    count <span class="token operator">+=</span> <span class="token number">1</span></pre></td></tr><tr><td data-num="20"></td><td><pre>    diss <span class="token operator">=</span> <span class="token number">0</span>   <span class="token comment">#误差</span></pre></td></tr><tr><td data-num="21"></td><td><pre>    deriv0 <span class="token operator">=</span> <span class="token number">0</span> <span class="token comment">#对 theta0 导数</span></pre></td></tr><tr><td data-num="22"></td><td><pre>    deriv1 <span class="token operator">=</span> <span class="token number">0</span> <span class="token comment">#对 theta1 导数</span></pre></td></tr><tr><td data-num="23"></td><td><pre>    <span class="token comment">#求导</span></pre></td></tr><tr><td data-num="24"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>m<span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="25"></td><td><pre>        deriv0 <span class="token operator">+=</span> <span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m</pre></td></tr><tr><td data-num="26"></td><td><pre>        deriv1 <span class="token operator">+=</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span><span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span></pre></td></tr><tr><td data-num="27"></td><td><pre>    <span class="token comment">#更新 theta0 和 theta1</span></pre></td></tr><tr><td data-num="28"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span>m<span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="29"></td><td><pre>        theta0 <span class="token operator">=</span> theta0 <span class="token operator">-</span> alpha<span class="token operator">*</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span> </pre></td></tr><tr><td data-num="30"></td><td><pre>        theta1 <span class="token operator">=</span> theta1 <span class="token operator">-</span> alpha<span class="token operator">*</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token operator">/</span>m<span class="token punctuation">)</span><span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span></pre></td></tr><tr><td data-num="31"></td><td><pre>    <span class="token comment">#求损失函数 J (θ)</span></pre></td></tr><tr><td data-num="32"></td><td><pre>    rand_ls <span class="token operator">=</span> <span class="token builtin">list</span><span class="token punctuation">(</span><span class="token builtin">range</span><span class="token punctuation">(</span><span class="token number">3</span><span class="token punctuation">)</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="33"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token builtin">range</span><span class="token punctuation">(</span><span class="token number">3</span><span class="token punctuation">)</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="34"></td><td><pre>        rand_i <span class="token operator">=</span> random<span class="token punctuation">.</span>randrange<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span>m<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="35"></td><td><pre>        rand_ls<span class="token punctuation">[</span>i<span class="token punctuation">]</span> <span class="token operator">=</span> rand_i</pre></td></tr><tr><td data-num="36"></td><td><pre>    <span class="token keyword">for</span> i <span class="token keyword">in</span> rand_ls<span class="token punctuation">:</span></pre></td></tr><tr><td data-num="37"></td><td><pre>        diss <span class="token operator">=</span> diss <span class="token operator">+</span> <span class="token punctuation">(</span><span class="token number">1</span><span class="token operator">/</span><span class="token punctuation">(</span><span class="token number">2</span><span class="token operator">*</span>m<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">*</span><span class="token builtin">pow</span><span class="token punctuation">(</span><span class="token punctuation">(</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token operator">-</span>y<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token number">2</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="38"></td><td><pre>    loss<span class="token punctuation">.</span>append<span class="token punctuation">(</span>diss<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="39"></td><td><pre></pre></td></tr><tr><td data-num="40"></td><td><pre>    <span class="token comment">#如果误差已经很小，则退出循环</span></pre></td></tr><tr><td data-num="41"></td><td><pre>    <span class="token keyword">if</span> diss <span class="token operator">&lt;=</span> <span class="token number">0.001</span><span class="token punctuation">:</span></pre></td></tr><tr><td data-num="42"></td><td><pre>        <span class="token keyword">break</span></pre></td></tr><tr><td data-num="43"></td><td><pre>    </pre></td></tr><tr><td data-num="44"></td><td><pre><span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">"本次迭代次数为：&#123;&#125;次，最终得到theta0=&#123;&#125;，theta1=&#123;&#125;"</span><span class="token punctuation">.</span><span class="token builtin">format</span><span class="token punctuation">(</span>count<span class="token punctuation">,</span>theta0<span class="token punctuation">,</span>theta1<span class="token punctuation">)</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="45"></td><td><pre><span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">"本次迭代得到的回归函数是：y=&#123;&#125;+&#123;&#125;*x"</span><span class="token punctuation">.</span><span class="token builtin">format</span><span class="token punctuation">(</span>theta0<span class="token punctuation">,</span>theta1<span class="token punctuation">)</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="46"></td><td><pre><span class="token comment">#画原始数据图和得到的线性回归函数图</span></pre></td></tr><tr><td data-num="47"></td><td><pre>matplotlib<span class="token punctuation">.</span>rcParams<span class="token punctuation">[</span><span class="token string">'font.sans-serif'</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token string">'SimHei'</span><span class="token punctuation">]</span></pre></td></tr><tr><td data-num="48"></td><td><pre>plt<span class="token punctuation">.</span>plot<span class="token punctuation">(</span>x<span class="token punctuation">,</span>y<span class="token punctuation">,</span><span class="token string">'bo'</span><span class="token punctuation">,</span>label<span class="token operator">=</span><span class="token string">'数据'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="49"></td><td><pre>plt<span class="token punctuation">.</span>plot<span class="token punctuation">(</span>x<span class="token punctuation">,</span><span class="token punctuation">[</span>theta0<span class="token operator">+</span>theta1<span class="token operator">*</span>x <span class="token keyword">for</span> x <span class="token keyword">in</span> x<span class="token punctuation">]</span><span class="token punctuation">,</span>label<span class="token operator">=</span><span class="token string">'线性回归函数'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="50"></td><td><pre>plt<span class="token punctuation">.</span>xlabel<span class="token punctuation">(</span><span class="token string">'x'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="51"></td><td><pre>plt<span class="token punctuation">.</span>ylabel<span class="token punctuation">(</span><span class="token string">'y'</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="52"></td><td><pre>plt<span class="token punctuation">.</span>legend<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="53"></td><td><pre>plt<span class="token punctuation">.</span>show<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr><tr><td data-num="54"></td><td><pre><span class="token comment">#画损失函数（误差）变化图</span></pre></td></tr><tr><td data-num="55"></td><td><pre>plt<span class="token punctuation">.</span>scatter<span class="token punctuation">(</span><span class="token builtin">range</span><span class="token punctuation">(</span>count<span class="token punctuation">)</span><span class="token punctuation">,</span>loss<span class="token punctuation">)</span></pre></td></tr><tr><td data-num="56"></td><td><pre>plt<span class="token punctuation">.</span>show<span class="token punctuation">(</span><span class="token punctuation">)</span></pre></td></tr></table></figure><p>输出结果：<br>本次迭代次数为：10000 次，最终得到<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mi>h</mi><mi>e</mi><mi>t</mi><mi>a</mi><mn>0</mn><mo>=</mo><mn>96.86808644268262</mn></mrow><annotation encoding="application/x-tex">theta0=96.86808644268262</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.69444em;vertical-align:0"></span><span class="mord mathnormal">t</span><span class="mord mathnormal">h</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span><span class="mord">0</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">9</span><span class="mord">6</span><span class="mord">.</span><span class="mord">8</span><span class="mord">6</span><span class="mord">8</span><span class="mord">0</span><span class="mord">8</span><span class="mord">6</span><span class="mord">4</span><span class="mord">4</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">2</span><span class="mord">6</span><span class="mord">2</span></span></span></span>，<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mi>h</mi><mi>e</mi><mi>t</mi><mi>a</mi><mn>1</mn><mo>=</mo><mn>5.762687172041142</mn></mrow><annotation encoding="application/x-tex">theta1=5.762687172041142</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.69444em;vertical-align:0"></span><span class="mord mathnormal">t</span><span class="mord mathnormal">h</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span><span class="mord">1</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">5</span><span class="mord">.</span><span class="mord">7</span><span class="mord">6</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">7</span><span class="mord">1</span><span class="mord">7</span><span class="mord">2</span><span class="mord">0</span><span class="mord">4</span><span class="mord">1</span><span class="mord">1</span><span class="mord">4</span><span class="mord">2</span></span></span></span><br>本次迭代得到的回归函数是：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi><mo>=</mo><mn>96.86808644268262</mn><mo>+</mo><mn>5.762687172041142</mn><mo>∗</mo><mi>x</mi></mrow><annotation encoding="application/x-tex">y=96.86808644268262+5.762687172041142*x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:.625em;vertical-align:-.19444em"></span><span class="mord mathnormal" style="margin-right:.03588em">y</span><span class="mspace" style="margin-right:.2777777777777778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:.2777777777777778em"></span></span><span class="base"><span class="strut" style="height:.72777em;vertical-align:-.08333em"></span><span class="mord">9</span><span class="mord">6</span><span class="mord">.</span><span class="mord">8</span><span class="mord">6</span><span class="mord">8</span><span class="mord">0</span><span class="mord">8</span><span class="mord">6</span><span class="mord">4</span><span class="mord">4</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">2</span><span class="mord">6</span><span class="mord">2</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.64444em;vertical-align:0"></span><span class="mord">5</span><span class="mord">.</span><span class="mord">7</span><span class="mord">6</span><span class="mord">2</span><span class="mord">6</span><span class="mord">8</span><span class="mord">7</span><span class="mord">1</span><span class="mord">7</span><span class="mord">2</span><span class="mord">0</span><span class="mord">4</span><span class="mord">1</span><span class="mord">1</span><span class="mord">4</span><span class="mord">2</span><span class="mspace" style="margin-right:.2222222222222222em"></span><span class="mbin">∗</span><span class="mspace" style="margin-right:.2222222222222222em"></span></span><span class="base"><span class="strut" style="height:.43056em;vertical-align:0"></span><span class="mord mathnormal">x</span></span></span></span><br><img data-src="https://jiang-hs.github.io/post-images/1590134267713.png#pic_l" alt="img"><br><img data-src="https://jiang-hs.github.io/post-images/1590134271990.png#pic_r" alt="img"></p><div class="tags"><a href="/tags/%E4%BA%BA%E5%B7%A5%E6%99%BA%E8%83%BD/" rel="tag"><i class="ic i-tag"></i> 人工智能</a> <a href="/tags/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80/" rel="tag"><i class="ic i-tag"></i> 机器学习基础</a></div></div><footer><div class="meta"><span class="item"><span class="icon"><i class="ic i-calendar-check"></i> </span><span class="text">更新于</span> <time title="修改时间：2021-08-25 11:32:03" itemprop="dateModified" datetime="2021-08-25T11:32:03+08:00">2021-08-25</time> </span><span id="posts/202f1f0f/" class="item leancloud_visitors" data-flag-title="梯度下降及线性回归" title="阅读次数"><span class="icon"><i class="ic i-eye"></i> </span><span class="text">阅读次数</span> <span class="leancloud-visitors-count"></span> <span class="text">次</span></span></div><div class="reward"><button><i class="ic i-heartbeat"></i> 赞赏</button><p>请我喝[茶]~(￣▽￣)~*</p><div id="qr"><div><img data-src="/images/wechatpay.png" alt="hang shun 微信支付"><p>微信支付</p></div><div><img data-src="/images/alipay.png" alt="hang shun 支付宝"><p>支付宝</p></div><div><img data-src="/images/paypal.png" alt="hang shun 贝宝"><p>贝宝</p></div></div></div><div id="copyright"><ul><li class="author"><strong>本文作者： </strong>hang shun <i class="ic i-at"><em>@</em></i>航 順</li><li class="link"><strong>本文链接：</strong> <a href="https://jiang-hs.gitee.io/posts/202f1f0f/" title="梯度下降及线性回归">https://jiang-hs.gitee.io/posts/202f1f0f/</a></li><li class="license"><strong>版权声明： </strong>本站所有文章除特别声明外，均采用 <span class="exturl" data-url="aHR0cHM6Ly9jcmVhdGl2ZWNvbW1vbnMub3JnL2xpY2Vuc2VzL2J5LW5jLXNhLzQuMC9kZWVkLnpo"><i class="ic i-creative-commons"><em>(CC)</em></i>BY-NC-SA</span> 许可协议。转载请注明出处！</li></ul></div></footer></article></div><div class="post-nav"><div class="item left"></div><div class="item right"><a href="/posts/d27e233f/" itemprop="url" rel="next" data-background-image="https:&#x2F;&#x2F;pic1.imgdb.cn&#x2F;item&#x2F;64427c710d2dde5777affd97.jpg" title="K最近邻分类算法（KNN）分析及实现"><span class="type">下一篇</span> <span class="category"><i class="ic i-flag"></i> 机器学习基础</span><h3>K最近邻分类算法（KNN）分析及实现</h3></a></div></div><div class="wrap" id="comments"></div></div><div id="sidebar"><div class="inner"><div class="panels"><div class="inner"><div class="contents panel pjax" data-title="文章目录"><ol class="toc"><li class="toc-item toc-level-1"><a class="toc-link" href="#%E4%B8%80-%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D"><span class="toc-number">1.</span> <span class="toc-text">一、梯度下降</span></a><ol class="toc-child"><li class="toc-item toc-level-2"><a class="toc-link" href="#1%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E7%AE%80%E4%BB%8B"><span class="toc-number">1.1.</span> <span class="toc-text">1. 梯度下降简介</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#2%E6%A0%B8%E5%BF%83%E5%85%AC%E5%BC%8F"><span class="toc-number">1.2.</span> <span class="toc-text">2. 核心公式</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#3%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E6%B3%95%E7%9A%84%E4%B8%80%E8%88%AC%E6%AD%A5%E9%AA%A4"><span class="toc-number">1.3.</span> <span class="toc-text">3. 梯度下降法的一般步骤</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#4%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E5%AD%98%E5%9C%A8%E7%9A%84%E9%97%AE%E9%A2%98"><span class="toc-number">1.4.</span> <span class="toc-text">4. 梯度下降存在的问题：</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#5%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E4%B8%89%E5%85%84%E5%BC%9Fbgdsgdmbgd"><span class="toc-number">1.5.</span> <span class="toc-text">5. 梯度下降三兄弟（BGD，SGD，MBGD）</span></a><ol class="toc-child"><li class="toc-item toc-level-3"><a class="toc-link" href="#1%E6%89%B9%E9%87%8F%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E6%B3%95batch-gradient-descent"><span class="toc-number">1.5.1.</span> <span class="toc-text">①批量梯度下降法（Batch Gradient Descent）</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#2%E9%9A%8F%E6%9C%BA%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E6%B3%95stochastic-gradient-descent"><span class="toc-number">1.5.2.</span> <span class="toc-text">②随机梯度下降法（Stochastic Gradient Descent）</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#3%E5%B0%8F%E6%89%B9%E9%87%8F%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E6%B3%95mini-batch-gradient-descent"><span class="toc-number">1.5.3.</span> <span class="toc-text">③小批量梯度下降法（Mini-batch Gradient Descent）</span></a></li></ol></li></ol></li><li class="toc-item toc-level-1"><a class="toc-link" href="#%E4%BA%8C-%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92"><span class="toc-number">2.</span> <span class="toc-text">二、线性回归</span></a><ol class="toc-child"><li class="toc-item toc-level-2"><a class="toc-link" href="#1%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92%E7%AE%80%E4%BB%8B"><span class="toc-number">2.1.</span> <span class="toc-text">1. 线性回归简介</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#2%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92%E5%88%86%E6%9E%90%E7%9A%84%E4%B8%80%E8%88%AC%E5%BD%A2%E5%BC%8F"><span class="toc-number">2.2.</span> <span class="toc-text">2. 线性回归分析的一般形式</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#3%E5%9B%9E%E5%BD%92%E5%88%86%E6%9E%90%E5%8F%AF%E4%BB%A5%E8%A7%A3%E5%86%B3%E7%9A%84%E9%97%AE%E9%A2%98"><span class="toc-number">2.3.</span> <span class="toc-text">3. 回归分析可以解决的问题</span></a></li></ol></li><li class="toc-item toc-level-1"><a class="toc-link" href="#%E4%B8%89-%E4%B8%80%E5%85%83%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92%E5%87%BD%E6%95%B0%E6%8E%A8%E5%AF%BC%E8%BF%87%E7%A8%8B"><span class="toc-number">3.</span> <span class="toc-text">三、一元线性回归函数推导过程</span></a></li><li class="toc-item toc-level-1"><a class="toc-link" href="#%E5%9B%9B-%E4%BB%A3%E7%A0%81%E5%AE%9E%E7%8E%B0"><span class="toc-number">4.</span> <span class="toc-text">四、代码实现</span></a><ol class="toc-child"><li class="toc-item toc-level-2"><a class="toc-link" href="#1%E6%89%B9%E9%87%8F%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D"><span class="toc-number">4.1.</span> <span class="toc-text">1. 批量梯度下降</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#2%E9%9A%8F%E6%9C%BA%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D"><span class="toc-number">4.2.</span> <span class="toc-text">2. 随机梯度下降</span></a></li><li class="toc-item toc-level-2"><a class="toc-link" href="#3%E5%B0%8F%E6%89%B9%E9%87%8F%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D"><span class="toc-number">4.3.</span> <span class="toc-text">3. 小批量梯度下降</span></a></li></ol></li></ol></div><div class="related panel pjax" data-title="系列文章"><ul><li class="active"><a href="/posts/202f1f0f/" rel="bookmark" title="梯度下降及线性回归">梯度下降及线性回归</a></li><li><a href="/posts/d27e233f/" rel="bookmark" title="K最近邻分类算法（KNN）分析及实现">K最近邻分类算法（KNN）分析及实现</a></li><li><a href="/posts/30c02801/" rel="bookmark" title="KNN算法实现鸢尾花数据集的分类">KNN算法实现鸢尾花数据集的分类</a></li><li><a href="/posts/2afaae3d/" rel="bookmark" title="K-means算法">K-means算法</a></li><li><a href="/posts/fe5ae0e7/" rel="bookmark" title="基于矩阵分解的推荐算法">基于矩阵分解的推荐算法</a></li><li><a href="/posts/a10feb4a/" rel="bookmark" title="协同过滤算法">协同过滤算法</a></li><li><a href="/posts/7ca31f7/" rel="bookmark" title="神经网络">神经网络</a></li><li><a href="/posts/c6767314/" rel="bookmark" title="卷积神经网络">卷积神经网络</a></li><li><a href="/posts/20417848/" rel="bookmark" title="循环神经网络">循环神经网络</a></li></ul></div><div class="overview panel" data-title="站点概览"><div class="author" itemprop="author" itemscope itemtype="http://schema.org/Person"><img class="image" itemprop="image" alt="hang shun" data-src="/images/avatar.jpg"><p class="name" itemprop="name">hang shun</p><div class="description" itemprop="description">世中逢尔，雨中逢花</div></div><nav class="state"><div class="item posts"><a href="/archives/"><span class="count">45</span> <span class="name">文章</span></a></div><div class="item categories"><a href="/categories/"><span class="count">10</span> <span class="name">分类</span></a></div><div class="item tags"><a href="/tags/"><span class="count">25</span> <span class="name">标签</span></a></div></nav><div class="social"><span class="exturl item github" data-url="aHR0cHM6Ly9naXRodWIuY29tL0pJQU5HLUhT" title="https:&#x2F;&#x2F;github.com&#x2F;JIANG-HS"><i class="ic i-github"></i></span> <span class="exturl item zhihu" data-url="aHR0cHM6Ly93d3cuemhpaHUuY29tL3Blb3BsZS9odWktc2h1bi14aW4tbGl1" title="https:&#x2F;&#x2F;www.zhihu.com&#x2F;people&#x2F;hui-shun-xin-liu"><i class="ic i-zhihu"></i></span> <span class="exturl item music" data-url="aHR0cHM6Ly9tdXNpYy4xNjMuY29tLyMvdXNlci9ob21lP2lkPTE4MzkwMTczMzI=" title="https:&#x2F;&#x2F;music.163.com&#x2F;#&#x2F;user&#x2F;home?id&#x3D;1839017332"><i class="ic i-cloud-music"></i></span> <span class="exturl item bilibili" data-url="aHR0cHM6Ly9zcGFjZS5iaWxpYmlsaS5jb20vMzIxMTYyNDg1" title="https:&#x2F;&#x2F;space.bilibili.com&#x2F;321162485"><i class="ic i-bilibili"></i></span></div><ul class="menu"><li class="item"><a href="/" rel="section"><i class="ic i-home"></i>首页</a></li><li class="item"><a href="/about/" rel="section"><i class="ic i-user"></i>关于</a></li><li class="item dropdown"><a href="javascript:void(0);"><i class="ic i-feather"></i>文章</a><ul class="submenu"><li class="item"><a href="/archives/" rel="section"><i class="ic i-list-alt"></i>归档</a></li><li class="item"><a href="/categories/" rel="section"><i class="ic i-th"></i>分类</a></li><li class="item"><a href="/tags/" rel="section"><i class="ic i-tags"></i>标签</a></li></ul></li><li class="item"><a href="/friends/" rel="section"><i class="ic i-heart"></i>友達</a></li><li class="item"><a href="/movie/" rel="section"><i class="ic i-play"></i>movie</a></li><li class="item"><a href="/music/" rel="section"><i class="ic i-music"></i>music</a></li></ul></div></div></div><ul id="quick"><li class="prev pjax"></li><li class="up"><i class="ic i-arrow-up"></i></li><li class="down"><i class="ic i-arrow-down"></i></li><li class="next pjax"></li><li class="percent"></li></ul></div></div><div class="dimmer"></div></div></main><footer id="footer"><div class="inner"><div class="widgets"><div class="rpost pjax"><h2>随机文章</h2><ul><li class="item"><div class="breadcrumb"><a href="/categories/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80/" title="分类于 机器学习基础">机器学习基础</a></div><span><a href="/posts/30c02801/" title="KNN算法实现鸢尾花数据集的分类">KNN算法实现鸢尾花数据集的分类</a></span></li><li class="item"><div class="breadcrumb"><a href="/categories/%E8%AE%BA%E6%96%87%E7%B2%BE%E8%AF%BB/" title="分类于 论文精读">论文精读</a> <i class="ic i-angle-right"></i> <a href="/categories/%E8%AE%BA%E6%96%87%E7%B2%BE%E8%AF%BB/%E6%96%B0%E9%97%BB%E6%8E%A8%E8%8D%90/" title="分类于 新闻推荐">新闻推荐</a> <i class="ic i-angle-right"></i> <a href="/categories/%E8%AE%BA%E6%96%87%E7%B2%BE%E8%AF%BB/%E6%96%B0%E9%97%BB%E6%8E%A8%E8%8D%90/%E9%9A%90%E7%A7%81%E4%BF%9D%E6%8A%A4%E6%96%B0%E9%97%BB%E6%8E%A8%E8%8D%90/" title="分类于 隐私保护新闻推荐">隐私保护新闻推荐</a></div><span><a href="/posts/cc988ffe/" title="UA-FedRec：对联邦新闻推荐的无目标攻击">UA-FedRec：对联邦新闻推荐的无目标攻击</a></span></li><li class="item"><div class="breadcrumb"></div><span><a href="/posts/9eceac59/" title="基于时空注意力的地点推荐模型学习笔记">基于时空注意力的地点推荐模型学习笔记</a></span></li><li class="item"><div class="breadcrumb"></div><span><a href="/posts/ae4d52e6/" title="顺序价格机制的强化学习">顺序价格机制的强化学习</a></span></li><li class="item"><div class="breadcrumb"><a href="/categories/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80/" title="分类于 机器学习基础">机器学习基础</a></div><span><a href="/posts/c6767314/" title="卷积神经网络">卷积神经网络</a></span></li><li class="item"><div class="breadcrumb"><a href="/categories/%E7%94%9F%E6%88%90%E6%A8%A1%E5%9E%8B/" title="分类于 生成模型">生成模型</a></div><span><a href="/posts/b2bcd60d/" title="一文读懂主流生成模型GAN、VAE、DM和DILL·E2等">一文读懂主流生成模型GAN、VAE、DM和DILL·E2等</a></span></li><li class="item"><div class="breadcrumb"></div><span><a href="/posts/e4142071/" title="GAN网络-简单明了">GAN网络-简单明了</a></span></li><li class="item"><div class="breadcrumb"><a href="/categories/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80/" title="分类于 机器学习基础">机器学习基础</a></div><span><a href="/posts/7ca31f7/" title="神经网络">神经网络</a></span></li><li class="item"><div class="breadcrumb"></div><span><a href="/posts/7c185f0e/" title="基于深度强化学习的长期推荐系统">基于深度强化学习的长期推荐系统</a></span></li><li class="item"><div class="breadcrumb"></div><span><a href="/posts/898b5689/" title="强化学习之Policy Gradient">强化学习之Policy Gradient</a></span></li></ul></div><div><h2>最新评论</h2><ul class="leancloud-recent-comment"></ul></div></div><div class="status"><div class="copyright">&copy; 2020 – <span itemprop="copyrightYear">2023</span> <span class="with-love"><i class="ic i-sakura rotate"></i> </span><span class="author" itemprop="copyrightHolder">hang shun @ hang shun</span></div><div class="count"><span class="post-meta-item-icon"><i class="ic i-chart-area"></i> </span><span title="站点总字数">267k 字</span> <span class="post-meta-divider">|</span> <span class="post-meta-item-icon"><i class="ic i-coffee"></i> </span><span title="站点阅读时长">4:02</span></div><div class="powered-by">基于 <span class="exturl" data-url="aHR0cHM6Ly9oZXhvLmlv">Hexo</span> & Theme.<span class="exturl" data-url="aHR0cHM6Ly9naXRodWIuY29tL2FtZWhpbWUvaGV4by10aGVtZS1zaG9rYQ==">Shoka</span></div></div></div></footer></div><script data-config type="text/javascript">var LOCAL={path:"posts/202f1f0f/",favicon:{show:"(´Д｀)被发现了！",hide:"（●´3｀●）我藏好了~"},search:{placeholder:"文章搜索",empty:"关于 「 ${query} 」，什么也没搜到",stats:"${time} ms 内找到 ${hits} 条结果"},valine:!0,copy_tex:!0,katex:!0,fancybox:!0,copyright:'复制成功，转载请遵守 <i class="ic i-creative-commons"></i>BY-NC-SA 协议。',ignores:[function(e){return e.includes("#")},function(e){return new RegExp(LOCAL.path+"$").test(e)}]}</script><script src="https://cdn.polyfill.io/v2/polyfill.js"></script><script src="//cdn.jsdelivr.net/combine/npm/pace-js@1.0.2/pace.min.js,npm/pjax@0.2.8/pjax.min.js,npm/whatwg-fetch@3.4.0/dist/fetch.umd.min.js,npm/animejs@3.2.0/lib/anime.min.js,npm/algoliasearch@4/dist/algoliasearch-lite.umd.js,npm/instantsearch.js@4/dist/instantsearch.production.min.js,npm/lozad@1/dist/lozad.min.js,npm/quicklink@2/dist/quicklink.umd.js"></script><script src="/js/app.js?v=0.0.0"></script></body></html><!-- rebuild by hrmmi -->